{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to CrypTen\n",
    "\n",
    "CrypTen is a machine learning framework built on PyTorch that enables you to easily study and develop machine learning models using secure computing techniques. CrypTen allows you to develop models with the PyTorch API while performing computations on encrypted data -- without revealing the protected information. CrypTen ensures that sensitive or otherwise private data remains private, while still allowing model inference and training on encrypted data that may be aggregated across various organizations or users. \n",
    "\n",
    "CrypTen currently uses secure multiparty computation (MPC) (see (1)) as its cryptographic paradigm. To use CrypTen effectively, it is helpful to understand the secure MPC abstraction as well as the semi-honest adversarial model in which CrypTen operates. This introduction explains the basic concepts of secure MPC and the threat model. It then explores a few different use cases in which CrypTen may be applied. \n",
    "\n",
    "\n",
    "## Secure Multi-party Computation\n",
    "\n",
    "Secure multi-party computation is an abstraction that allows multiple parties to compute a function on encrypted data, with protocols designed such that every party is able to access only their own data and data that all parties agree to reveal.\n",
    "\n",
    "Let's look at a concrete example to better understand this abstraction. Suppose we have three parties, A, B and C that each have a private number, and together want to compute the sum of all their private numbers. A secure MPC protocol for this computation will allow each party to learn the final sum; however, they will not learn any information about the other 2 parties' individual private numbers other than the aggregate sum.\n",
    "\n",
    "In secure MPC, the data owner encrypts its data by splitting it using random masks into <i>n</i> random shares that can be combined to reconstruct the original data. These <i>n</i> shares are then distributed between <i>n</i> parties. This process is called <i>secret sharing</i>. The parties can compute functions on the data by operating on the secret shares and can decrypt the final result by communicating the resulting shares amongst each other.\n",
    "\n",
    "\n",
    "## Use Cases\n",
    "In the tutorials, we show how to use CrypTen to four main scenarios:\n",
    "<ul>\n",
    "<li> <b>Feature Aggregation</b>: In the first scenario, multiple parties hold distinct sets of features, and want to perform computations over the joint feature set without sharing data. For example, different health providers may each have part of a patient's medical history, but may wish to use the patient's entire medical history to make better predictions while still protecting patient privacy.</li>\n",
    "    \n",
    "<li> <b>Data Labeling</b>: Here, one party holds feature data while the another party holds corresponding labels, and the parties would like to learn a relationship without sharing data. This is similar to the feature aggregation with the exception that one party has labels rather than other features. For example, suppose that in previous healthcare scenario, one healthcare provider had access to health outcomes data, while other parties had access to health features. The parties may want to train a model that predicts health outcomes as a function of features, without exposing any health infomration between parties.</li> \n",
    "    \n",
    "<li> <b>Dataset Augmentation</b>: In this scenario, several parties each hold a small number of samples, but would like to use all the examples in order to improve the statistical power of a measurement or model. \n",
    "    For example, when studying wage statistics across companies in a particular region, individual companies may not have enough data to make statistically significant hypotheses about the population. Wage data may be too sensitive to share openly, but privacy-preserving methods can be used to aggregate data across companies to make statistically significant measurements / models without exposing any individual company's data. </li>\n",
    "    \n",
    "<li> <b>Model Hiding</b>: In the final scenario, one party has access to a trained model, while another party would like to apply that model to its own data. However, the data and model need to be kept private.  \n",
    "    This can happen in cases where a model is proprietary, expensive to produce, and/or susceptible to white-box attacks, but has value to more than one party. Previously, this would have required the second party to send its data to the first to apply the model, but privacy-preserving techniques can be used when the data can't be exposed. </li> \n",
    "</ul>\n",
    " \n",
    "The tutorials illustrate how CrypTen can be used to model each of these scenarios.\n",
    "\n",
    "\n",
    "## Threat Model\n",
    "When determining whether MPC is appropriate for a certain computation, we must assess the assumptions we make about the actions that different parties can take. The <i>threat model</i> we use defines the assumptions we are allowed to make. CrypTen uses the <i>\"honest-but-curious\"</i> threat model (see (1, 2)). (This is sometimes also referred to as the <i>\"semi-honest\"</i> threat model.)  It is specified by the following assumptions: \n",
    "<ul>\n",
    "    <li> Every party faithfully follows the protocol: i.e., it performs all the computations specified in the program, and communicates the correct results to the appropriate parties specified in the program.</li>\n",
    "    <li> The communication channel is secure: no party can see any data that is not directly communicated to it.</li>\n",
    "    <li> Every party has access to a private source of randomness, e.g., a private coin to toss</li>\n",
    "    <li> Parties may use any data they have already seen and perform arbitrary processing to infer information.</li>\n",
    "</ul>\n",
    "\n",
    "\n",
    "\n",
    "## References\n",
    "(1) Goldreich Oded. 2009. Foundations of Cryptography: Volume 2, Basic Applications (1st ed.). Cambridge University Press, New York, NY, USA.<br>\n",
    "(2) Andrew C. Yao. 1982. Protocols for secure computations. In Proceedings of the 23rd Annual Symposium on Foundations of Computer Science (SFCS '82). IEEE Computer Society, Washington, DC, USA, 160-164."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
